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1,  Introduotlon  , 


The  purpose  of  this  paper  is  to  describe  a Generalized  Reduced 
Gradient  (GRG)  algorithm  for  nonlinear  programming,  its  implementation 
as  a FORTRAN  program  for  solving  small  to  medium  size  problems, 
and  some  computational  resvilts.  Our  focus  is  more  on  the  software 
implementation  of  the  algorithm  than  on  its  mathematical  properties. 

This  is  in  line  with  the  premise  that  robust,  efficient,  easy  to  use  NLP 
software  must  be  written  and  made  accessible  if  nonlinear  programming 
is  to  progress,  both  in  theory  and  in  practice. 

Recently,  there  has  been  increased  emphas?  s on  the  software 
implementation  of  algorithms  in  many  branches  of  mathematical  programming, 
e.g.,  networks  [l],  mixed  integer  programming  [2],  unconstrained  NLP 
[3f  5]  and  some  constrained  NLP  [6,7].  The  earliest  work  on  GKG 

is  by  Abadie  [8,9] ^ whose  efforts  form  a basis  for  this  work.  Abadie 
and  others  [11,  12]  have  written  GRG  codes  which  have  been  disseminated 
to  a limited  extent.  However,  the  detailed  operation  of  these  codes  has 
not  been  described  in  the  literature.  We  present  some  comparisons  vdth 
Abadie *s  most  recent  code  in  Section  6 . 
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2.  Brief  Description  of  GRG  Algorithms 

GRG  algorithms  solve  nonlinear  programs  of  the  form 


minimi  5se 
subject  to 


W^> 


gi<X)  - 0, 

0 <.  gj(X)  <.  ub(n+i), 
«-h(  i ) <.  <.ub(  i ) 


i = 1,  neq 
i = neq  + 1,  m 
i = 1,  n 


(1) 


where  X is  a vector  of  n variables.  The  number  of  equality  constraints, 
neq,  may  be  zero.  The  functions  are  assumed  differentiable. 

There  are  many  possible  GRG  algorithms.  Their  underlying  concepts 
are  described  in  references  |8]  - [lo].  This  paper  briefly  describes 
the  version  currently  implemented  in  our  code. 

The  user  submits  the  problem  in  the  above  form.  It  is 
converted  to  the  following  equality  form  by  adding  slack  variables 


\ . m f * • ♦ 

n+1 

/ 


» X 


n+m 


minimize 
subject  to 


gj(X)  - X„*i  ' 0, 
<^b(i)  <.X.  <.  ub(i), 
Zb(i)  = ub(i)  = 0, 
Hb(l)  = 0 


i = 1,  m 

i = 1,  n+m  (2) 

i - n+1,  n + neq 
i = n + neq  + 1,  n+m 


These  last  two  equations  are  the  bounds  for  the  slack  variables.  The 
variables  X^,  ...,  X^  will  be  called  "natural”  variables. 


2 


Let  X satisfy  the  constraints  of  (l),  and  assume  that  nb  of 
the  constraints  are  binding  (i.e.,  hold  as  equalities)  at  X.  A 
constraint  g^  is  taken  as  binding  if 

|g^  - ub(n+i)l  < EPNEWT 
or  |g^  - £b(n+i)|  < EPNEWT 


i.e.,  if  it  is  vidthin  EPNEWT  of  one  of  its  bounds.  The  tolerance 

EPNEWT  is  one  of  the  most  critical  parameters  in  the  code.  It  can  be 

♦ 

set  by  the  user,  and  has  a default  value  of  lO"*^. 

GRG  uses  the  nb  binding  constraint  equations  to  solve  for  nb 
of  the  natural  variables,  called  the  basic  variables,  in  terms  of  the 
remaining  n-nb  natural  variables  and  the  nb  slacks  associated  with 
the  binding  constraints.^  These  n variables  are  .called  nonbasic.  Let 
y be  the  vector  of  nb  basic  variables  and  x the  vector  of  n nonbasic 
variables,  with  their  values  corresponding  to  X denoted  by  (y,  x). 

Then  the  binding  constraints  can  be  written. 


g(y,x)  = 0 


2 


where  g is  the  vector  of  nb  binding  constraint  functions.  The  basic 
variables  must  be  selected  so  tnat  the  nb  by  nb  basis  matrix 


The  degenerate  case  is  considered  in  Section  5.6. 

? 

"The  definitions  of  g are  extended  here  to  include  the  slacks. 
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B = (3gi/3yj ) 


is  nonsingular  at  X,  Then  the  binding  constraints  (3)  may  be  solved 
(conceptually  at  least)  for  y in  terms  of  x yielding  a function 
y(x),  valid  for  all  (y,x)  sufficiently  near  (y,x).  This  reduces  ohe 
objective  to  a function  of  x only 


and  reduces  the  original  problem  (at  le'aet  in  the  neighborhood  of  (y,x)), 
to  a sin5)l6r  reduced  problem 


minimize 


subject  to  £ £ X <,  u 


where  £ and  u are  the  bound  vectors  for  x.  The  function  F(x) 
is  ca3.1ed  the  reduced  objective  and  its  gradient,  VF(x),  the  reduced 
gradient. 

This  GRG  code  solves  the  original  problem  (1)  by  solving  (perhaps 
only  partially)  a sequence  of  reduced  problems.  The  reduced  problems 
are  solved  by  a gradient  method.  At  a given  iteration  with  nonbasic 
variables  x and  basic  variables  y,  is  computed,  and  VF(x)  is 
evaluated  as  follows: 


.J: 

'i'-f'w 


'X'iiSki 

m 


9F/3Xj^  = ^gmn/^'Sc  - 


A search  direction  9 is  formed  from  VF(x)^  and  a one  dim^sioiial  search 
is  initiated,  whose,  goal  is  to  solve  the  prohlan 


minimi;  7(x  + a3) 
a>0 


This  minimization  is  done  only  approximately,  and  is  accomplished  by  choosing  a 
sequence  of  positive  values  “*  generated  by 

subroutine  SEARCH,  described  in  Section, 5. 3.  For  each  value  a^,  F(x  + aj3) 
must  be  evaxuated.  By  (4),  this  is  equal  to  x + Oj^3), 

so  the  basic  variables  y(x  + 0^3)  must  be  determined.  These  satisfy 
the  system  of  equations 

' g(y,  X + a^3)  = 0 (5) 

where  x,  3,  are  Icnown  and  y is  to  be  found.  This  system  is 
solved  by  a variant  of  Newton’s  method. 

The  one  dimensiorAl  search  can  terminate  in  three  different  ways. 

First,  Newton's  method  may  not  converge.  If  this  occurs  on  the  first 
step,  i.e,,  i = 1 in(5),  then  Oj^is  reduced  and  we  try  again.  Otherwise, 
the  search  is  terminated.^  Second,  if  the  Newton  method  converges,  scmie 
constraints  (which  were  previously  not  binding)  or  basic  variable  bounds  may  be 
violated.  Then  the  code  determines  a new  o value  such  that  at  least 
one  such  new  constraint  or  variable  is  at  its  bound  and  all  others  are 


^For  a discussion  of  alternative  strategies,  see  Section  5.3. 
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satisfied.  If  certain-  conditions  are  met  (see  description  of  subroutine 
SEAJICH/  Section  5. 3 K the  new  constraint  is  added  to  the  set  of  binding 
constraints,  the  one  dimensional  search  is  terminated,  and  solution  of 
a new  reduced  problem  begins.  Finally,  the  search  may  continue  until 
an  objective  value  is  found  which  is  larger  than  the  previous  value.  ‘Ihen  a 
quadratic  is  fit  to  the  three  values  bracketing  the  minimum,  F is  evaluated 
at  the  minimum  of  this  quadratic,  and  the  seiarch  terminates  with  the 
lowest  F value  found.  The  reduced  problem  remains  the  same. 

An  in^jortant  feature  of  this  algorithm  is  its  attanpt  to  return 
to  the  constraint  surface  at  each  step  in  the  one  dimensional  search. 

This  differs  from  earlier  strategies  suggested  by  Abadie  [ 9 1 and  by 
Luenberger  [ 13  ] , which  involve  linear  searches  on  the  tangent  plane 
to  the  constraint  surface  prior  to  returning  to  that  surface.  We  have 
not  experimented  with  such  strategies,  choosing  to  return  to  the  surface 
each  time  because  it  was  sin5)ler  and  we  felt  it  would  lead  to  a more 
reliable  algorithm.  Cpmpxitationa?  experience  presented  later  shows  that 
if  properly  implemented,  this  strategy  can  be  developed  into  an  algorithm 
that  is  both  reliable  and  efficient. 


3.  Program  Structure 

This  code  is  con5>osed  of  a main  program  and  a number  of  subroutines 
written  in  FORTRAN  IV-.  The  main  program  and  subroutines  are  arranged  in  a 
hierarchical  structure , as  shown  in  Figure  1.  (Subroutines  of  minor 
iinportsuice  are  not  shown. ) 


DATAIN 


OUTRES 


CONSBS  /REDGRA 


DIREC 


SEARCH 


'PARSH 


DEGEN 


REDOBJ 


NEWTON 


GCOMP 


Figure  1.  GRG  Main  Program  and  Subroutines 


«JW3<‘'?:^arS3»  V _ . - .j  . 


The  primary  function  of  MAIN  is  to  call  DATAIN  to  read  the  input 

data,  GRG  to  solve  the  problem,  and  OUTRES  tp  print  the  results.  It 

then  tests  a user-specified  flag  to  decide  whether  to  stop  or  to  return 

» 

to  read  additional  data.  It  is  very  simple,  and  is  easily  modified  to 
work  within  a larger  system.  All  subroutines  below  and  including  GRG 
comprise  the  GRG  algorithm  for  iteratively  solving  the  problem  (l)-(2) 
Their  functions  are  briefly  described  in  Table  1.  ' 


ft'- 

^ l^sS; 


w ^ 


CONSBS 


DEGEN 


PARSH 


REDGRA 


DIREC 


SEARCH 


REDOBJ 


l^EWTON 


GCOMP 


Controls  main  iterative  loop.  Computes  Initial  basis 
inverse  and  calls  DIREC  to  con^jute  search  direction. 

Calls  one  dimensional  search  subroutine  SEARCH.  Tests 
for  optimality. 

Computes  Basis  Inverse#  BINV^^. 

Computes  search ‘direction  when  basis  is  degenerate. 

Given  cvirrent  X and  G vectors,  computes  array 
GRAD,  whose  (i#o)  element  is  the  partial 
derivative  of  g.  with  respect  to  X.  . May  be 
user  supplied.  If  not,  there  is  a system  subroutine 
PARSH  which  computes  GRAD  by  forward  difference 
approximation. 

Given  BINV  and  GRAD,  coii5>utes  Lagrange  multiplier 
vector  T,  and  reduced  gradient  of  either  phase  I or 
phase  II  objectives,  GRADF. 

Computes  search  direction. 

Performs  one  dimensional  search. 

Computes  values  of  basic  variables  for  given  values  of 
nonbasics  by  calling  NEWTON.  Takes  action  if  NEWTON 
does  not  converge.  Checks  for  constraint  violations. 

If  any  are  violated,  finds  feasible  point  where  some 
initially  violated  constraint  ic  binding  and  othei^s 
satisfied. 

Uses  Newton’s  Method  to  compute  values  of  basic  variables 
for  given  values  of  nonbasics.  If  convergence  not 
achieved,  sets  flag  and  returns. 

User  supplied  subroutine.  Given  current  X vector, 
computes  vector  of  m+1  function  values  G,  where 
G(l),  ...,  G(m)  are  constraint  function  values  and 
G(m  + 1)  is  the  objective. 

Table  1.  Furictions  of  Major  GRG  Subroutines 
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4.  uoer  Inputs 


The  only  subroutine  which  the  user  must  provide  is  v3C0MP,  which 


computes  the  function  for  a given  vector  X.  First 

derivatives  of  the  functions  g^  are  required,  hut  these  maj  he  computed 
by  a fc'ystem  subroutine.  ^ARSI  using  finite  difference  approximations  (first 


order  for-iard  difi'ereiui’ct/.  Alternatively,  the  user  may  supply  a sub- 


routine PARSr  whicii  compu-.  r first  derivatives  by  analytic  formulas  or 


other  means.  The  finite  diff  r<„ice  spprcxima+ions  have  been  completely 


satisfactory  in  all  problt-i.  ■ srl/ed  r-fc-is  far,  and  eliminate  the  (often 


considerable)  btarden  o'*  coding  PAHSh. 


The  o-‘*'.er  inp  , '’ita  aie  the  pi  •‘b  Ion  dimensions  m,  n,  neq,  the 


bounds  ubj  and  , Jnitia  valu^^i  fo-  X,  and  (optionally)  a list 


of  vuniabbio  to  he  i • th,  "Itia’'  Vao.s  if  that  is  possible.  If  the  initial 


values  of  X not  sat.bi,  ur-'  bounds  (Z),  an  error  message  ?s  printed 


and  the  program  s-cops,  The^s  is  no  conceptual  difficulty  in  augmenting 


the  phase  I pxocedvire  tc  deal  with  bound  violations,  and  future  versions 


wilJ.  i;iclude  "iihis  f -ature.  All  daxa  ai>e  checked  for  obvious  errors 


(e,g. , ub^  < ib^)  and  are  printed  for  inspection.  In  addition  to  the 


above  mentioned  data,  the  user  may  specify  a number  of  control  and 


tolerance  parameters  '"to  be  H-'.scussed  later),  or  may  leave  them  at 


.jeir  default  values. 


'--i  'fell- 


‘t4s‘  * 

fijll  “S I 


5,  Detailed  Algorithmic  Structure 

The  flow  charts  in  this  section  are  in  aggregated  form.  Their 
purpose  is  to  describe  overall  program  logic.  However,  they  correspond 
closely  to  the  actual  FORTRAN  code. 


5*1  Subroutine  GRG 

This  is  the  controlliog  subroutine  for  the  iterative  process.  It  starts 

by  calling  GCOMP  to  evaluate  the  initial  constraint  and  objective  values. 

If  any  constraints  are  violated,  a phase  I procedure  is  entered  in  which  the 

objective  is  the  sum  of  absolute  values  of  constraint  violations.  The  array  ICAND 

in  block  1 is  used  by  CONSBS  in  choosing  basic  columns.  GONSBS  will 

attempt  to  enter  columns  in  ICAND  into  the  basis  first  (in  the  order 

listed)  before  trying  other  columns.  If  the  user  has  not  specified  an 
initial  ICAND,  block  1 sets  it  to  the  index  set  {1,  ..,  n). 

The  current  X is  considered  optimal  if  either  of  two  tests  is 

met  in  block  2,  The  first  test  checks  if  the  following  conditions  are  met: 
for  i = l,n  but  x(i)  not  a slack  variable  for  an  equality 

constraint 


x(i)  = lb  (i)  =*=c> 

x(i)  = ub  (i)  =*v> 

ib(i)  < x(i)  < ub(i)  =■=> 


GRADF(i)  i -EPSTOP 
GRADFU^  <.  EPSTOP 
IcRADFli)!  < EPSTOP 


The  quantities  x(i)  are  the  current  nonbasic  variables  and  GRADF(i') 


11 


is  the  component  of  the  reduced  gradient.  Thir.  tests  whether  ■ 

the  Kuhn- Tucker  optimality  conditions  are  satisfied  to  within  EPSTOP, 

a small  positive  num’  which  can  be  controlled  by  the  user,  with 

-U  / 

default  value  10  . The  alack  variables  for  equality  constraints  (i.e., 

the  variable  X(n  + l)  to  X(h  + »^eq))  are  excluded  from  the  test 

because  they  must  be  zero  in  any  feasible  solution.  The  second  optimality 

test  checks  if  the  condition 

ABS(FM  - OBJTST)  < EPSTOP  * ABS(OBJTST) 

is  satisfied  for  NSTOP  consecutive  iterations.  In  the  above,  FM 
is  the  current  objective  value  and  OBJTST  is  the  objective  VEilue  at 
the  start  of  the  previoiis  one  dimensional  search,  NSTOP  has  a default 
value  of  3. 

If  CONSBS  constructs  a degenerate  basis  then  it  also  conq)Utes 
a useable  feasible  search  direction.  Hence,  DIREC  need  not  be  called 
in  block  5.  Otherwise,  DIREC  generates  a search  direction  and,  in 
either  case,  SEARCH  is  called  to  find  the  best  point  in  the  given 


direction. 


Figure  2.  Subroutine  GRG 


Start 


Call  GGOMP 


Specify  array  ICAND. 
compute  violated  constraint  indices 


Call  CONSBS 


Call  REDGRA 


Current  poinv 
optimal 


Return 


If  not  degenerate,  call  DIREC 


Call  SEARCH 


Test  for  Return 


ine  search  ended  Yrith  quadrat 
interpolation  and  no  binding 
cpnstraints  strictly  satisfie 


ICAND- 

IBV 

ISET  — 

— 0 

The  test  in  block  4 retxirns  if  SEARCH  has  halved  the  initial 


step  size  10  times  without  finding  a lower  objective  value  and  if  d = -7F. 


A small  fraction  of  problems  have  terminated  in  this  way,  always  with 


objective  values  close  to  optimality.  In  block  5,  the  Y branch 


indicates  the  case  where  the  basis  If  unchanged  and  the  line  search 


terminated  as  in  an  xinconstrained  probi.em.  Then  ICAMD  is  set  to  the 


previous  basic  variable  list,  IBV,  so  that  CONSBS  will  again  choose 


these  as  basic  variables,  if  possible.  This  is  desirable,  because  of 


the  use  oi  -,ne  BF£  ■>.rr  - table  metric  algorithm  [ l^i-  ] in  DIREC  to  choose 


the  seaTv..'  ,(i.r3ction,  d.  This  algorithm  uses  an  approximation  to 


(v'^  F(x  ))"■*■  to  generate  d,  and  the  approximation  is  not  valid  unless 


the  reduced  problem  remains  the  same.  If  the  reduced  problem  changes 


(N  branch,  block  5),  ICAND  is  set  to  {1,  ...,  N}  to  give  CONSBS  freedom 


to  construct  as  well  conditioned  a basis  as  possible.  Setting  ISET  to 


1 forces  DIREC  to  reset  the  BFS  algorithm  so  that  d = -VF.  Testing 


for  strictly  satisfied  constraints  in  block  5 permits  reducing  the  size 


of  the  basis  by  dropping  these  from  the  binding  set. 


In  most  problems  solved  thus  far,  initial  iterations  end  with  a 


new  constraint  being  added  to  or  deleted  from  the  basis  or  with  a change 


in  basic  v;.  ^ables.  The  code  is  finding  the  set  of  constraints  which 


are  binding  at  optimality,  and  a compatible  set  of  basic  variables.  Once 


this  is  done,  the  BFS  algorithm  takes  over,  and  the  code  operates  in  an 


"vinconstralned"  mode  until  the  optimum  is  reached.  Most  of  the  improvement 


in  the  objective  and  most  of  the  computations  occur  in  the  first  phase 


of  this  process,  so  it  must  be  done  efficiently.  The  critical  operations 


‘•'-f 


here  involve  violating  one  or  more  non-binding  consxraints  during  the 
line  search,  and  finding  a new  feasible  point  where  one  (or  more)  of 
the  violated  constraints  is  binding.  This  latter  operation  occurs  in 
subi-v^utine  REDOBJ. 

5.2  Subroutine  DIREC 

This  subroutine  computes  a search  direction,  d.  Currently  d 
is  computed  using  tVie  Bioyden-Fletcher-Shanno  (BFS)  variable  metric 
algorithm  [l4]-  [l6],  modified  to  accomodate  upper  and  lower  bounds  on 
the  variables  as  suggested  by  Goldfarb  [l7].  An  n x n matrix  H,  which 
is  updated  after  each  step,  is  used  to  determine  d as 

d = -H  * GRADF  (6) 

where  GRADF  is  the  reduced  gradient.  If  no  one-dimensional 
searches  have  been  performed  or  subroutine  GHG  has  set  ISET  to  1,  then 
d is  taken  as  the  negative  reduced  gradient  direction  and  H is  reset 
to  an  identity  matrix  with  zero  diagonal  elements  for  nonbasic  variables 
at  bounds.  H is  updated  only  if  a test  is  passed  (see  [l6])  indicating 
that  the  new  H will  be  positive  definite.  Numerical  error  may  invalidate 
this  test,  so  the  directional  derivative  r = VP  d is  tested  after 
applying  (6);  if  r > 0,  H is  reset  as  described  above. 


The  BPS  method  was  chosen  because  computations  in  [l4]  - [l6] 
and  elsewhere  show  it  to  be  one  of  the  most  efficient  and  reliable  of 
the  variable  metric  algorithms.  In  the  absence  of  rounding  error,  it 
minimizes  a quadratic  fimction  of  n variables  in  at  most  n one 
dimensional  searches.  Hence,  this  GEG  algorithm  is  finite  for  quadratic 
programs . 

The  last  function  of  DIREC  is  to  test  whether  nonbasic  variables 

currently  at  bounds  should  be  allowed  to  move  away  from  these  bounds. 

This  is  done  as  follows:  the  signs  of  the  reduced  gradient  components  of 

nonbasic  variables  at  bounds  are  checked.  Among  all  variables  at  lower 

(upper)  bounds  whose  reduced  gradient  components  are  negative  (positive), 

the  gradient  component  of  largest  absolute  value  is  chosen.  If  this  value 

is  denoted  by  MAXMULT  and  the  corresponding  index  is  IDROP,  then  variable 

IDROP  is  released  from  its  bound  if 

N . 5 

I (dU))*^  < (MAXMULT  )V4 
i«l 

This  test, prescribed  in  [ IT],  is  designed  to  prevent  "zigzagging",  i.e., 
variables  continxially  coming  on  and  off  their  boxmds. 

Since  H is  a symmetric  matrix,  only  the  diagonal  and  super-diagonal 
elements  are  needed,  sind  these  are  stored  in  a linear  array  H in  row 
order.  Storage  requirements  can  be  further  reduced  by  noting  that  H 
has  zero  rows  and  columns  corresponding  to  nonbasic  variables  at  their 
bounds.  This  feature,  not  yet  exploited,  would  significantly  reduce 
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con5>utation  time  and  storage  requirements  when  most  nonb-isic  variables 
are  at  bounds.  An  extension  of  this  idea  to  general  linear  constraints 
has  been  developed  by  Prof.  A . Buckley  [l8],  who  pointed  out  its  inplica- 
tions  to  us.  Of  course,  H can  be  eliminated  entirely  by  use  of  the 
Conjugate  Gradient  Algorithm  [ 1$  ] , which  is  also  finite  for  quadi'atic 
functions.  Its  use  will  be  examined  in  future  GRG  codes  designed  for 
large  scale  problems. 

Localizing  all  computation  of  the  search  direction  in  a single 
subroutine  provides  a great  deal  of  flexibility.  Other  direction  finding 
procedures  can  be  implemented  simply  by  coding  a new  DIREC,  leaving  tne 
rest  of  the  system  vinchanged. 


i m 
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5.5  Subroutine  SEARCH 

Subroutine  DIREC  provides  search  directions  for 
subroutine  SEARCH,  in  which  the  variables  of  the  problem  are 
assigned  new  values.  This  subroutine  finds  a first  local  minimum  for  the 


problem 


minimize  F(x  + ad) 
a 


The  direction  d is  always  a direction  of  descent,  i.e.. 


d^(x)  < 0 


This  subroutine  searches  for  three  a values,  A,  B and  C,  which 


satisfy 


0 < A < B < C 


F(x  + Ad)  > P(x  + Bd)  < P(x  + Cd)  . 


Then  the  interval  [A,C]  contains  a local  minimum  of  P(x  + ad). 

In  block  11  of  Figure  3,  a quadratic  in  a is  passed  through  A,  B, 
and  C,  with  its  minimum  at  D.  The  best  point,  B or  D,  is  taken  as  an 
estimate  of  the  optimal  a and  a return  is  made. 

In  finding  (A,B,C)  the  choice  of  initial  step  size,  B, 

(block  1),  is  important.  With  the  BFS  or  other  variable  metric  methods, 

B is  set  equal  to  the  optimal  a value  fiom  the  previous  search  except 
\dien  this  causes  too  large  a change  in  the  variables.  The  theoretical 


basis  for  this  is  that,  as  a variable  metric  method  converges,  the 
optimal  a values  should  converge  to  1,  the  optimal  step  size  for 
Newton's  Method.  Hence,  the  previous  optimal  step  is  a good  approxi- 
mation to  the  current  one.  This  must  be  modified  when  the  method 
is  restarted,  for  example,  when  a new  constraint  is  encoiintered  or  the 
basis  is  changed,  since  then  an  optimal  step  much  less  than  unity  is 
generally  taken.  Hence,  we  require  that  the  change  in  any  noribasic 
variable  larger  than  1.0  in  absolute  value  not  exceed  .05  times  its 
value,  while  the  change  in  any  variable  smaller  than  1.0  in  absolute 
value  cannot  exceed  0.05.  If  the  largest  a value  meeting  these  con- 
ditions is  a^,  and  is  the  step  size  found  by  SEABCH  at  iteration  i-1. 


B = min(a^_^j^,a^) 


if  the  previous  search  teiminated  with  an  interpolation,  and  B = a 
otherwise. 

The  subroutine  operates  in  two  phases,  halving  the  initial  step 
size  (if  necessary)  until  an  improved  point  is  found  (loop  2 - 5 - ^ - 5)> 
and  doubling  the  step  size  until  the  minimum  is  bracketed 
(loop  7-8-9-10).  In  each  phase  the  nonbasic  variables  are  changed, 
and  subroutine  REDOBJ  is  called  to  compute  the  reduced  objective 
F(x  + ctd)  (blocks  2 and  8).  If,  in  the  doubling  phase,  the  minimum  is 
not  bracketed,  the  A and  B points  are  replaced  by  the  B and  C 
points,  C is  replaced  by  2B  (block  10),  and  the  new  F value  is  computed. 
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The  only  blocks  idiich  would  not  appear  in  a line  search  for 
unconstrained  problems  axe  3,  6,  If  and  9.  These  deal  with  two 
occurrences:  (a)  In  attemptiiig  to  evaluate  the  basic  variables  in 

subroutine  BEDOBJ  by  Newton's  method,  Newton  may  not  converge,  and 
(b)  REDOBJ  may  produce  a point  at  which  a previously  loose  constraint 
is  binding.  In  block  5,  the  step  size  is  halved  if  Newton  does  not 
converge.  Theoretically,  convergence  must  ultimately  occur;  in  practice, 
if  the  loop  2-3-^-5is  traversed  10  times,  a return  to  GRG  is  made. 
Blocks  6 and  9 terminate  the  search  if  case  (b)  occurs,  while  9 also 
teimiinates  under  case  (a).  Block  7 terminates  the  search  if  more  than 
6 Newton  iterations  were  required  the  last  time  the  basic  variables  were 
evaluated  in  subroutine  NEWTON.  Experience  has  shown  that  the  next 
NEWPON  call  usually  will  not  converge. 

Before  beginning  the  line  search,  SEARCH  computes  the  largest 
a,  a,  such  that  x + od  satisfies  the  nonbasic  variable,  bounds.  The 
logic  required  to  insure  that  a does  not  violate  this  bound,  and  to 
detect  the  case  where  a is  optimal,  has  been  omitted  from  the  flow  chart 
for  simplicity. 

The  strategy  of  terminating  the  search  if  an  improved  point  has 
already  been  found  and  Newton  did  not  converge  or  took  too  many  iterations 


was  not  adopted  initially.  Earlier  versions  of  the  code  attempted  to 
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push  on  with  the  line  search  by  recomputing  B at  the  last  feasible 
point  found  and  cutting  the  step  taken  to  l/5  its  previous  value. 

The  current  strategy  has  been  found  to  be  much  superior.  Now  b"^ 
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is  coiqputed  only  once.,  at  the  beginning  of  each  one  dimensional  search, 

idiere  it  is  needed  acyvays  to  compute  VF.  In  a set  of 

six  test  problems,  the  current  strategy  required  about  half  the  function 

and  gradient  evaluations  of  the  older  one  (see  [20^  for  details), 

and  also  significantly  reduced  the  number  of  times  b”^  is  evaluated. 

This  strategy,  plus  the  rather  conservative  choice  of  initial  step  size, 
plays  a large  part  in  making  an  algorithm  vhlch  returns  to  the  constraint 
surface  each  time  efficient. 

Recently,  much  interest  has  been  expressed  in  "step  size" 
algorithms  to  replace  the  line  search.  These  attempt  to  obtain 
only  a "sufficient"  decrease  in  the  objective,  rather  than  to  find  its 
minimum.  We  have  opted  for  a "sloppy"  atteopt  o find  the  minimum 
because  of  its  " inslde-oxit"  natvire,  i.e.,  a small  initial  step  is  taken, 
then  increased  if  necessary.  This  approach  is  more  likely  to  remain  within 
the  radius  of  convergence  of  Ne^rton’s  method#  viz.,  that,  range  of  a values 
for  which  Newton  will  converge,  with  evaluated  only  at  a - 0, 

This  radius  imposes  an  upper  bound  on  a of  unknown  value.  Most  step- 
size  strategies  choose  large  a values  first  (e.g.,  a = 1 in  [22]), 
then  reduce  them  if  necesseury,  and  hence  are  more  likely  to  exceed 
this  bound.  The  fact  that  the  number  of  Newton  iterations  required  to 
evaluate  F(x  + ad)  increases  as  a increases  also  works  against 
"outside-in"  strategies.  These  same  comments  also  apply  to  algorithms 
which  do  a one  dimensional  search  on  the  tangent  plane  to  the  constraints 
prior  to  returning  to  the  constraint  surface,  as  in  [ll]-[15l. 


5.4  Subroutine  REDOBJ 

This  subroutine  evaluates  the  reduced  ob;jective  function  P(x  + ad) 
for  given  x,  a,  and  d.  It  does  so  by  attempting  to  solve  the  system 
of  nb  (possibly  nonlinear)  equations 


g^(y,  X + ad)  =0 


i € IBC 


for  the  nb  basic  variables  y,  where  IBC  i the  index  set  of  binding  constraints. 
As  in  [ 8 ] and  [ 9 ] , this  is  accomplished  in  subroutine  NEWTON,  using 
the  pseudo-Newton  algorithm 


^b+1  ""  ^t  " t = 0,  1,  2,  ...  (8) 


m 


where  gg  is  the  vector  of  binding  constraints.  The  algorithm  is  celled 
pseudo-Newton  because  is  evaluated  once  at  the  initial  point  of 

the  search,  5C,  instead  of  being  reevaluated  at  each  step  of  the  algorithm, 
as  in  the  standeurd  Newton  method. 

An  initial  estime^e  of  the  basic  variables,  y^,  is  confuted  either 
by  linear  or  quadratic  extrapolation  in  block  1 of  Figure  4.  As  in  [8  ]> 
the  linear  extrapolation  uses  the  tangent  vector 


V = fi 


(9) 


In  our  code,  v is  computed  at  X.  It  is  used  to  find  initial  values, 
y^,  by  the  formula 


% 
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Figure  4.  Subroutine  REDOBJ 
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Yq  = y + 0=1^ 

Using  these  initial  values,  Newton  finds  the  feasible  point  Xj^.  Then, 
at  Xj^,  V is  not  recomputed.  The  old  v is  used,  but  ananating  now 
from  Xj^,  to  yield  the  next  Sv't  of  initial  values  as 

Vq  = Xl  + («2  - «i)v 


Using  these,  Newton  finds  a new  point  X^.  -This  procedure  is  repeated 
until  the  one  dimensional  search  is  over. 


Quadratic  extrapolation  may  also  be  used  to  obtain  initial 
estimates  of  the  basic  variables,  at  the  users  option.  Initially, 
linear  extrapolation  is  used.  After  the  first  feasible  point,  X^^,  is 
found,  qiiadratic  functions  are  fit  to  the  value  and  slope  at  a = 0,  and 
the  value  at  of  each  basic  variable.  These  arc  used  to  predict  their 
values  et  a = a^.  Subsequent  quadratics  are  fit  to  the  values  of  the 
basic  variables  at  a^,  ^±-2  Computational  experience 

with  these  options  appears  in  Section  6.4. 

In  blocks  2 and  8 , Newton  is  considered  to  have  converged  if 


the  condition 


NOEMG  = max  | g (X^ ) 1 < EPNEWT 
i e IBC  ^ •' 


1A 


I ■ 
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is  met  within  ITLT.M  iterations.  Currenfiy  EFJSBWT  = 10“  and  ITLIM  = 10. 


If  IDRKG  has  not  decreased  from  its  previous  value  (or  the  above  condition 


is  not  met  in  10  iterations)  Newton  has  not  converged. 


Once  Newton  has  converged,  possible  constraint  violations  must  be 


checked  (block  5).  There  are  several  reasons  vi'-y  the  current  step 


C'  may  be  too  large; 


(l)  A strictly  satisfied  constraint  may  hf-ve  violate  i an  upper 


or  lower  bound. 


(2)  A constraint  in  IA3X)VE,  the  set  of  constraints  initially 


violating  their  upper  bounds,  may  violate  a lower  bound. 


(3)  A constraint  in  IBELOW  may  violate  an  upper  bound, 

(4)  A basic  variable  may  violate  a lower  or  upper  bound. 


V ary  of  these  cases  hold,  a is  r<^iuced  bo  a value  ot* 


where  no  constraints  are  violated  and  at  least  one  new  constraint  is  equal 


to  a hound.  To  determine  this  constraint,  an  estimate  is  made  of 


in  Mock  3*  Linear  interpolation  between  the  current  and  previous 


value  i of  the  violated  constiaint  is  used. 


The  next  step  determines  whether  case  4 or  one  of  cases  1 - 5 is 


to  be  dealt  with>  according  to  which  has  the  sms-llest  linear  estimate 


of  a . Acaumlng  cases  1 - 5 as  an  example,  w?  then  wish  to  solve  the 


system 


g^(y(a)^  X + od)  - 0, 


i e IBC 


gj^(y(a)j  X + ad)  = 0 


■where  a is  a new  variable  and  = 0 a>  new  equation.  The  Jacobian 
for’ this  system  is 


J = 


where 

d = 

w = (^g^/^x)^d 

and  c is  an  nb  component  column  vector  whose  elements  are 

for  i c IBC.  The  border  vectors  c^  d^  and  the  scalar 
w axe  computed  in  block  5.  Until  recently,  this  block  also  computed 
j”^  by  the  bordered  inverse  formula  (using  the  known  b"^),  and  this 
j“^  was  used  by  subroutine  HEOTON  in  block  7 to  solve  the  augmented 
system  (lo).  However,  this  has  the  disadvantages  that  the  old  is 
erased,  and  it  would  not  be  suitable  for  a large  scale  GRG  implementation 
since  it  involves  changing  the  dimension  of  the  inverse.  However, 
inspection  of  ( 8 ) shows  that  j"^  is  not  required,  only  the  product 
of  j"^  with  the  vector  (gg,gj^).  This  product  can  be  expressed  in 
terms  of  b"^  and  the  border  elements  c,  d,  w using  the  bordered 
inverse  formula.  This  is  done  in  subroutine  HEWTON,  which  tests  to 
see  if  the  system  ( 7 ) or  the  augmented  system  (lO ) is  fco  be  solved, 
and  computes  the  appropriate  matrix-vector  product. 


The  basis  change  procedure  here,  based  on  solving  (lO ) and 
then  calling  CONSBS,  differs  from  that  of  Abadie  in  [ 8 ] and  [ 9 ] < 
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In  Abadie's  procedure^  the  basic  variables  in  ( 8 ) are  checked 
for  each  t.  If  any  violate  their  bounds,  one  is  selected  to  leave  the 


basis,  an  entering  variable  is  chosen,  and  a pivot  operation  updates 
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B . The  Newton  iteration  (8  ) then  continues.  This  could  lead 
to  "false"  basis  changes,  since  violation  of  a bound  during  the  iterative 
process  does  not  imply  violation  when  the  process  converges.  Newton's 
method  need  not  converge  componentwise  monotonically.  Failure  to  con- 
verge could  also  lead  to  a false  basis  change.  In  addition,  if  the 
(Abadie)  basis  change  is  accomplished,  the  new  point  may  have  an  objective 
value  larger  than  P(x),  which  makes  proof  of  convergence  unlikely. 

Our  procedure  avoids  both  these  objections,  perhaps  at  the  expense  of 
slightly  more  computation.  We  feel  that  the  (presumed)  increase  in 
reliability  makes  the  tradeoff  worthwhile. 


5.5  Subroutine  CONSBS 


This  subroutine  selects  a set  of  basic  variables  and  computes  the 
,-l 


basis  inverse,  B . Its  input  is  a list  of  indices  of  variables,  the 
candidate  list  ICAND.  The  outputs  of  CONSBS  include  (a)  a new  list  of 
binding  and  strictly  satisfied  constraint  indices  (b)  a new  list  of 
basic  and  nonbasic  variable  indices  and  (c)  the  new  basis  inverse,  B 

The  call  to  PARSH  in  block  1 of  Figure  5 computes  the  gradients  of 
the  objective  and  all  constraints  at  the  current  point.  The  gradients  of 


the  binding  constraints  are  stored  in  an  array,  which  is  where  the  pivoting 
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to  compute  B is  done. 


I 


start 


Determine  indices  of  binding 
and  loose  constraints 

,,  7X.  ,o 

CALL  PARSH 


Select  candidate  columns 
for  basis 


any  candidates 


Select  slack  or  other 
variable  at  bound  to 
enter  basis . Choose 
pivot  row 


Choose  candidate  column 
with  largest  norm  as 
pivot  column 


Choose  pivot  row  as  one 
with  largest  absolute 
element  in  pivot  column 


ABS  (pivct,  element)  > EPSPIV 


Perform  pivot 
operation 


Label  pivot  column 
tenfflorarily  inadmissible 


Pivo^ 

performed/ 


Any  candidate  columns  left 


CALL  DEGEW 


[Basis  Degenerate 


RETURN 


Figure  5.  Subroutine  CONSBS 
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The  subroutine  operates  in  2 modes.  In  mode  1 CONSBS  will  choose 
pivot  columns  from  whatever  candidate  ?ist  was  input  to  it.  If  a basis 
inverse  could  not  be  constructed  from  columns  in  this  candidate  list, 
or  if  the  original  candidate  list  included  all  variables,  the  mode 
indicator  is  set  to  2,  and  CONSBS  will  choose  pivot  columns  from  the 
list  of  all  admissible  columns.  A column  is  admissible  if  it  is  not 
scheduled  to  leave  the  basis  and  if  it  has  not  yet  been  pivoted  in. 

For  simplicity,  the  mode  logic  is  not  shown  in  Figure  5 . 

In  block  2,  candidate  coltimns  are  chosen  as  those  which  are 
admissible  and  whose  variables  are  farthest  from  their  nearest  bound 
(to  try  to  reduce  the  number  of  basis  changes  required) . A maximum  of 
five  are  chosen,  and  variables  very  close  to  their  nearest  bound  are 
ignored.  If  no  candidates  can  be  found  and  we  are  in  mode  1,  the  mode 
is  set  to  2,  the  candidate  list  becomes  the  set  of  all  admissible 
columns,  and  we  return  to  point^.  If  we  are  in  mode  2,  a variable 
at  a bound  is  chosen  to  enter  the  basis  (block  7 ) . The  basis  is  labeled 
degenerate.  A pivot  row  is  chosen  in  the  same  way  as  when  the  basis  is 
nondegenerate,  so  we  proceed  to  discuss  that  case. 

The  logic  in  blocks  5 and  4 represents  complete  pivoting  within 
the  set  of  candidate  columns,  i.e.,  the  entire  submatrix  is  examined 
to  find  the  largest  pivot  element.  In  block  5,  the  norm  is  used, 
taken  over  all  rows  not  yet  pivoted  in.  The  default  value  of  EPSPIV  in 
block  5 is  10 

If  the  basis  formed  is  degenerate,  subroutine  DEGEN  it  called  in 
block  7 to  compute  a search  direction  which  will  decrease  the  objective 
without  immediately  violating  ary  basic  variable  bounds.  A flag  is  set 


We  thank  Professor  Jean  Abadie  for  suggesting  that  we  design  CONSBS  in 
this  way. 


to  bypass  the  call  to  DIREC  in  subroutine  GRG.  DEGEN  is  discussed 
in  the  next  section. 

Earlier  veisions  of  CONSBS  employed  partial  pivoting,  i.e.,  the 
pivot  row  was  chosen  in  sequential  order  1,  2,  . . . , NB,  while  the  pivot 
column  was  chosen  as  the  column  selected  by  block  5 with  largest 
absolute  element  in  that  row.  This  was  replaced  by  complete  pivoting 
>dien  it  failed  on  some  test  problems,  encountering  rows  where  all 
elements  were  too  small,  and  where  a nonsequential  order  of  choosing 
rows  led  to  acceptable  pivots. 

A recent  enhancement  to  CONSBS  consists  of  additional  logic  just 

prior  to  entering  block  6.  At  this  point,  all  admissible  pivot  elements 

corresponding  to  variables  not  at  bounds  have  absolute  values  smaller 

than  EPSPIV  (default  value  10**^).  The  new  logic  tests  the  largest 

admissible  pivot  element}  if  its  absolute  value  is  larger  than 
-2 

10  * EPSPIV,  we  choose  it  as  the  pivot  element  rather  than  accepting 

a variable  at  a bound.  The  motivation  for  this  is  that  pivoting  on  an 
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element  larger  than  10  is  not  likely  to  cause  severe  problems, 
especially  if  it  is  the  largest  element  available,  and  if  the  alternative 
is  introducing  a variable  at  a bound  into  the  basis.  This  new  logic 
has  permitted  solution  of  a few  test  problems  which  could  not  be  solved 
previously.  The  previous  failures  were  caused  by  very  small  decreases 
in  objective  value,  along  (unit  vector)  search  directions  generated 
by  subroutine  DEGt-N.  With  the  new  logic,  DEGEN  is  not  called,  and  the 
search  directions  generated  by  DIREC  lead  to  the  optimum. 
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5.6  Subroutine  DEGEN 

Subroutine  DEGEN  is  called  by  CONSBS  when  the  basis  constructed 
is  degenerate,  i.e.,  has  one  or  more  basic  variables  at  bounds.  In  this 
case,  the  search  direction,  d,  produced  by  DIBEC  may  cause  some  of  these 
variables  to  violate  a botind  immediately,!,  e. , d may  nob  be  a feasible 
direction.  DEGEN  computes  a useable  feasible  direction  or  proves  that 
the  current  point  satisfies  the  Kuhn-Tucker  conditions. 

Let  L and  U be  the  index  sets  of  basic  variables  at  lower  and 
upper  bounds  respectively,  A direction  d is  feasible  if  the  following 
conditions  are  satisfied; 


3v  + Nd  = 0 


V.  > 0,  i I L 
1 ’ 


V,  < 0,  i € U 


Here  B is  the  basis,  and  N contains  the  nonbasic  Jacobian  columns. 

The  vector  v is  the  tangent  vector  of  equation  (9),  and  is  the  vector 

of  directional  derivatives  of  y in  the  direction  d.  Conditions  (12j-(15) 
require  that  basic  variables  at  ].ower  bound  increase,  and  conversely 

for  variables  at  upper  bound.  Strict  inequality  is  required  in  (12) 
and  (15)  to  ensure  that  v^  remains  non- negative  for  a small  movement 

in  the  direction  d.  Our  code  relaxes  conditions  (12)-(15)  to 

-S  -7 

Vj^  > -e,  i £ L and  < g,  i £ U,  with  g in  the  range  10  to  10  . 

This  accepts  a small  amount  of  infeasbility,  allowing  for  numerical 
error. 


DEGEN  tegins  by  checking  if  the  direction  d produced  by 
DIREC  is  feasible.  If  so  it  is  used.  Otherwise  the  column  of  N 
whose  reduced  gradient  yields  the  largest  rate  of  improvement  in  F 
is  found — if  none  yield  improvement,  the  Kuhn-Tucker  conditions  are 
satisfied  and  the  current  point  is  accepted  as  optimal.  The  direction 
d is  set  to  the  corresponding  unit  vector  and  if  v (equal  to 
times  the  column  chosen)  passes  the  feasibility  test,  d is  accepted. 
Otherwise  the  largest  absolute  element  of  v in  rows  with  indices 
in  L or  U is  found.  If  it  is  large  enough,  it  is  used  as  a pivot 
element,  a pivot  operation  is  performed  in  (ll),  and  the  process  begins 
again  with  the  new  basis  B.  If  no  basis  is  repeated  this  process 
must  terminate  finitely,  either  with  a useable  feasible  direction  or 
with  the  optimality  test  met.  In  the  3 or  4 degenerate  problems 
solved  thus  far,  calls  to  DEGEN  have  required  at  most  2 or  3 pivots. 

The  test  referred  to  above  for  a pivot  element  being  large 

-3 

enough  initially  uses  a tolerance  of  10  If  a potential  pivot 
element  is  less  than  this,  that  column  is  labelled  temporarily  inadmissible. 
If  all  improving  columns  become  inadmissible,  the  pivot  tolerance  is 
reduced  to  the  same  e value  used  to  test  for  feasibility,  and  the 
inadmissible  column  with  largest  pivot  element  is  entered  into  the  basis. 


6.  Computational  Experiments 

6.1.  Comparison  with  Intex'iur  Penalty  Methods 

Seven  test  problems  were  solved  by  the  GRG  code  and  by  the 
interior  penalty  code  described  in  [ 21] . Problem  characteristics 
are  shown  in  table  2 


Problem 

Source  or  Nature 
of  Problem 

No.  of 
Variables 

No.  of 
Equality 
Constraints 

No.  of 
Inequality 
Constraints 

1 

Quadratic  Objective 
and  Constraints 

4 

0 

5 

2 

No.  11  of  [4] 

5 

0 

3 

3 

No.  5 of  [21] 

8 

0 

23 

4 

No.  18  of  [4] 

15 

0 

5 

5 

No.  6 of  [21] 

17 

10 

8 

6 

No,  10  of  [4] 

5 

0 

10 

7 

Alkylation  Problem 
from  [27] 

7 

0 

14 

TABLE  2 


Characteristics  of  Test  Problems 


J ->  ’t 
:4^:'  *?■ 


'In  solving  problem  5 by  the  interior  penalty  code,  the  equality 
constraints  were  used  to  eliminate  10  variables,  leaving  an  inequality 
constrained  problem;  All  problems  were  solved  on  the  UNIVAC  1108  at 
Case  Western  Reserve  University.  Results  are  shown  in  Table  3. 


Statistic 

Penalty 

GRG 

Reduction  Factor 
Penalty/ORG 

One  Dimensional 
Searches 

^93 

9!^ 

5.3  - 

Function  Calls 

5773 

1121; 

Gradient  Calls 

559 

99 

Equivalent 
Function  Calls* 

86k3 

2023 

li.5 

% 


TABLE  3*  Ccanparison  of  GRG  and  Interior  Penalty  Codes 


Equivalent  Function  Calls  s Function  Calls  + N* Gradient  Calls 

where  N = No.  of  variables. 


I 

I 

I 

i 

I 

I 

i 

& 

$ 
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GRG  required  far  fewer  one-dimensional  searches,  function  evaluations 


and  gradient  evaluations.  While  some  of  this  reduction  was  offset  hy 


the  requirement  of  matrix  inversion  and  solution  of  nonlinear  equations 


in  GRG,  GRG  produced  more  accurate  solutions  for  most  of  the  prohlems. 


Computation  times  were  a few  seconds  or  less,  and  differences  in 
computation  times  (of  the  order  of  tenths  of  a second)  could  not  he 


estimated  accurately  due  to  the  masking  effects  of  m-iltiprogramming. 


6.2  Solution  of  Himmelhlau  Test  Problems 


The  first  twenty-four  problems  specified  in  Appendix  A of  reference 


[U],  which  includes  all  of  the  problems  with  equality  or  inequality  con- 


straints or  element  value  bounds , have  also  been  solved  with  GRG  . 


Several  were  solved  with  more  than  one  starting  point  and  one  (number 


22)  with  four  sets  of  parameter  values.  Table  shows  the  results  of 
solving  these  problems  on  an  IBM  57o/ll«-5  at  Cleveland  State  University. 


In  this  table,  the  Newton  Average  is  the  total  number  of 


iterations  of  the  quasi-Newton  method  (see  section  divided  by  oh 


number  of  times  solution  of  a nonlinear  system  was  attempted,  i.e.. 


the  number  of  calls  to  subroutine  NEWTON.  The  Colville  standard  time 


is  obtained  by  dividing  the  execution  times  by  77.85,  the  time  in 


seconds  required  to  rvin  the  Colville  standard  timing  program  (see  [4]) 
on  the  370/145.  These  numbers  provide  some  means  (admittedly  imperfect) 


for  other  investigators,  using  different  computers,  to  compare  results. 


We  thank  Professor  David  Himmelblau  for  providing  us  with  a card  deck 
for  these  problems. 


1 

i 


•4«i* 


The  current  version  of  GRG  solved  all  hut  one  of  these  prohlans 

(number  6),  in  the  sense  that  at  least  a local  minimum  was  found.*  In 

all  but  two  of  the  problems  (6  eind  14)  the  final  ob;jective  values  attained 

by  GRG  starting  with  the  initial  points  specif iec’  in  [4]  either  matched 

the  solutions  specified  in  [4],. bo  at  least  one  part  in  one  thousand, 

or  were  better  than  the  solutions  given  in  [4].  ' ■ 

In  problem  number  6,  using  the  starting  point  specified  in  [4], 

GRG  reached  a point  with  function  value  -1&65.9&  con^ared  to 

-1910.361  given  in  [4].  However  the  constraints  were  satisfied  within 
12 

1 part  in  10  using  GRG  but  only  within  1 part  in  20  for  the  solutions 
given  i/  [4].  Using  a different  starting  point  (x  = O)  GRG  does  reach 
an  objective  value  of  -1910.22  with  constraints  satisfied  to  1 part" 


Problem  number  14  '*  contained  a myriad  of  local  optima  of  many 

different  values."  Depending  on  the  starting  point  used,  GRG  generated 

several  'different  solutions.  Using  the  initial  feasible  starting  point 

in  [4],  GRG  reached  a minimum  of  261,550  compared  to  250800  in  [4]. 

From  the  nonfeasible  starting  point  GRG  attains  a minimum  01  260,508. 

Starting  from  x^  = 0,  i 4,  Xj^  = 2000  GRG  reaches  a value  251,786. 

These  results  vere  attained  using  the  default  values  for  all 

peirameters  in  GRG,  finite  difference  approximations  for  derivatives, 

quadratic  extrapolation  for  basic  variables,  and  double  precision 

floating  point  computations.  We  also  examined  the  .ffects  of  changing 

/ \ -4  -3 

the  EPSTOP  parameter  (see  section  5.1)  from  10  to  10  . This  reduced 
run  times  for  the  Himmelblau  problems;  only  slightly  for  most  problems, 
up  to  23%  for  some.  Two  problems,  however,  stopped  significantly  short 
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Function  and  Equivalent  One  Colville 
Problem  Best  function  Best  function  gradient  function  dJmensional  Hewton  Execution  standard 
Number  N,M, NEQ  value  reported  value  with  GRG  evaluations  evaluations  searches  average  time  (sec)  Time 
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Results  of  Solving  Hinnelbliiu  Problems 
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of  optimality.  This  suggests  a strategy  in  which  a loose  stopping 
tolerance  is  met,  the  tolerance  is  decreased,  and  the  procedure  repeated 
until  no  significant  change  in  objective  or  x values  is  detected. 

This  had  not  yet  been  implemented. 

Table  5 shows  the  results  of  using  analytic  derivatives  for 
problems  6 and  2J>.  The  constraints  for  these  problems  were  linear  and 
h‘,nce  analytic  derivatives  were  easily  obtained.  For  all  practical 
purposes  the  number  of  iterations,  etc.  was  independent  of  the  gradient 
computation.  Run  times  are  dramatically  shorter  when  the  gradients 
are  ajialytically  computed.  Even  greater  reductions  could  have  beer 
obtained  by  coding  the  partial  derivative  subroutines  PARSH  to  evaluate 
the  constraint  gradients  (which  are  constant)  once  only. 

In  order  to  obtain  some  measure  of  the  impact  of  performing  the 
floating  point  computations  in  double  precision,  the  Colville  program 
was  modified  to  also  run  in  this  mode.  As  a result  of  this  change,  the 
average  execution  time  increased  from  TT.85  to  109.89  seconds  or  41^^. 
Earlier  versions  of  GRG  used  only  single  precision  computations  and 
occasionally  failed  to  locate  a relative  minimum.  Again,  in  the  interest 
of  robustness,  execution  speed  has  been  sacrificed  in  the  current  version. 


Problem 

Number 


Time  with 
Finite  biff.  Derv. 


Time  with 
Analytic  Deriv. 


Percent 

Increase 


6 

227.26 

66.08 

244 

23 

570.3? 

3.03.64 

451 

Totals 

797.73 

169.72 

368 

TABLE  5.  Analytic  versus  Finite  Difference  Derivatives 
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6.3*  Comparison  with  Two  Other  NLP  Codes 

The  test  problems  of  Table  4 have  also  been  solved*  by 
two  other  codes— Abadie’s  1975  GRBCt  (a  GRG  code)  and  a code  developed 
by  Newell  and  Himmelblau  [22]  implementing  an  "exact"  exterior  penalty 
method,  or  "Method  of  Multipliers"— as  well  as  by  an  earlier  version 
of  the  code  described  here.  All  problems  were  solved  on  the  CDC-6600 
at  the  University  of  Texas  at  Austin.  Computation  times  (in  seconds) 
are  shown  in  Table  6,  where  F indicates  a failure  to  solve  the 

i 

problem  and  a dash  indicates  that  no  attempt  to  solve  was  made. 

The  totals  row  is  the  sum  of  times  for  those  17  problems  which  were 
solved  by  all  3 codes. 

The  GRG  code  described  here  compares  well,  especially  since  the 
current  version  has  solved  all  but  problem  21.  In  addition,,  our  code 
used  finite  difference  approximations  to  derivatives,  vrtiile  the 
other  two  computed  them  analytically.  Solution  times  are  con- 
sistently small,  and  the  code  does  well  on  the  larger  problems  l8  and  20, 
Its  advantage  in  total  time  over  the  other  two  codes  is  due  to 
a distinct  superiority  on  a few  problems.  In  fact,  the  Newell  code 
was  fastest  in  10  problems,  ours  in  5 and  GREG  in  2. 


* 

We  are  grateful  to  Professor  David  Himmelblau  for  running 
these  tests. 


Problem 

Abadie 

GREG  Lasdon-Waren 

Newell  method 

number 

1975 

GRG  Cod6 

of  multipliers 

1 

.11 

.09 

.01 

2 

.21 

VO 

• 

.02 

3 

.18 

.15 

.07 

4 

1.07 

00 

• 

.88 

5 

.18 

.16 

.01 

7 

1.70 

1.58 

.45 

8 

.84 

.77 

.16 

9 

27.59 

1.55 

5.11 

10 

.31 

.41 

.16 

Ilf 

— 

.12 

.64 

llnf 

.21 

— 

.62 

12 

2.58 

F 

11.7 

13 

.17 

.11 

— 

14 

.56 

.46 

.58 

15 

.52 

.64 

.22 

16 

P 

F 

.19 

17 

.21 

.29 

.08 

18 

3.51 

2.36 

13.15 

I9nf 

.97 

F 

3.96 

20 

16.52 

1.00 

6.06 

21 

50.28 

F 

6.9'f 

22A 

.16 

.18 

.85 

220 

.29 

.25 

1.04 

24 

.15 

.09 

.04 

TOTAIS 

54.22 

10.92 

28.89 

Computation  Times  (CD6600 

TABLE  6 

Sec.)  for  Himmelblau  Problems 
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dratic  Ejctraiaolation  of  Basic  Variables 


As  discussed  in  Section  l4->  either  linear  or  quadratic  extrapolation 
can  be  used  in  subroutine  REDOBJ  to  estimate  initial  values  for  the  basic 
variables.  The  quadratic  estimates  take  more  time  and  storage  to  compute 
but  should  reduce  the  number  of  Newton  iterations  required.  To  assess 
the  relative  merits  of  these  two  options  all  test  problems  of  section  6.3 
requiring  any  Newton  iterations  were  solved  both  ways.  Only  for  problem 
number  l8  did  quadratic  extrapolation  take  more  equivalent  function 
evaluations  although  the  number  of  Newton  iterations  remained  the  same. 

In  all  other  problems  quadratic  extrapolation  performed  as  well  as  or 
better  than  linear  extrapolation.  On  an  overall  basis,  the  nuniber  of 
equivalent  function  evalviations  decreased  by  Newton  iterations 
decreased  l6j^  and  execution  time  decreased  by  5/^.  In  another  series  of 
tests,  the  7 problems  of  section  6.1,  plus  2 others  of  15  and  7 
variables  were  solved.  All  are  highly  nonlinear.  Using  quadratic 
versus  linear  extrapolation  yielded  the  following  cumulative  results: 
one -dimensional  searchec  increased  l9j^,  Newton  iterations  decreased 
and  equivalent  function  evaluations  decreased  20^^.  We  conclude  that 
quadratic  extrapolation  can  reduce  function  evaluations  significantly 
(especially  for  highly  nonlinear  problems ) , and  should  reduce  computation 
time  appreciably  when  GCOMP  calls  account  for  a large  fraction  of  the 


overall  computational  effort. 
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7.  Conclusions  and  Future  Work 


The  results  presented  here  indicate  that  GRG,  as  Ijapleaented 
in  this  code,  is  an  efficient  and  reliable  way  to  solve  small  to 
moderate  size  NLP  problems.  Abadie's  GRG  codes  have  also  received 
extensive  testing  (e.g.  in  [U  3 and  [2J]).  Our  results  support 
the  conclusions  reached  by  these  authors,  namely  that  GRG  is  a leading 
contender  among  NLP  algorithms.  Since  small,  dense,  highly  nonlinear 
prooleuio  have  oiwost  infinite  ic  unlikely  that  any 

one  algorithm  will  be  best  foi*  ail  pror;iems.  Modern  penalty 
"augmented  Lagrangian"  algorithms  (see  ['2:4  ])  are  among  the  chief 
competitors,  and  conparative  tests  are  planiiod  in  the  near  future. 

The  best  test  of  a code  is  to  attempt  to  solve  a wide 
variety  of  real  problems.  To  facilitate  such  testing,  we  are  offering 
the  code  to  persons  in  nonprofit  research  institutions  for  a preparation 
charge  of  ^55  . Both  user  and  system  documentation  will  be  pro- 

vided. Interested  parties  are  urged  to  contact  the  authors. 

Future  work  will  attenpt  to  extend  GRG  to  large,  sparse, 
"mostlj'-  linear"  NLP  problems  of  general  (nonseparable)  nature.  A 
number  of  large  problems  of  special  form,  e.g.,  electric  power 
distribution  [ 25]  and  hydroelectric  scheduling  ' 26] , have  already 
been  successfully  solved  by  vaxiants  of  GRG.  To  our  knowledge,  no 
general  pvirpose  GRG  code  capable  of  solving  large  problems  exists 
today.  We  believe  that  few  new  ideas  are  required  to  construct  such 
a code;  a union  of  existing  LP  technology  and  some  of  the  ideas  pre- 
sented here  should  suffice.  Work  on  this  project  has  recently  begun. 
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